Learning in Games: Robustness of Fast Convergence

Authors

  • Dylan J. Foster
  • Zhiyuan Li
  • Thodoris Lykouris
  • Karthik Sridharan
  • Éva Tardos
Abstract

We show that learning algorithms satisfying a low approximate regret property experience fast convergence to approximate optimality in a large class of repeated games. Our property, which simply requires that each learner has small regret compared to a (1 + ε)-multiplicative approximation to the best action in hindsight, is ubiquitous among learning algorithms; it is satisfied even by the vanilla Hedge forecaster. Our results improve upon recent work of Syrgkanis et al. [28] in a number of ways. We require only that players observe payoffs under other players’ realized actions, as opposed to expected payoffs. We further show that convergence occurs with high probability, and show convergence under bandit feedback. Finally, we improve upon the speed of convergence by a factor of n, the number of players. Both the scope of settings and the class of algorithms for which our analysis provides fast convergence are considerably broader than in previous work. Our framework applies to dynamic population games via a low approximate regret property for shifting experts. Here we strengthen the results of Lykouris et al. [19] in two ways: We allow players to select learning algorithms from a larger class, which includes a minor variant of the basic Hedge algorithm, and we increase the maximum churn in players for which approximate optimality is achieved. In the bandit setting we present a new algorithm which provides a “small loss”-type bound with improved dependence on the number of actions in utility settings, and is both simple and efficient. This result may be of independent interest.
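
As an illustration of the approximate regret idea in the abstract, the following minimal sketch runs the vanilla Hedge forecaster on synthetic losses in [0, 1] and compares its expected cumulative loss with a (1 + ε)-multiplicative approximation to the best fixed action. It is not taken from the paper: the function names, the synthetic loss table, and the specific bound (1 + ε)·L* + 2·ln(d)/ε are assumptions chosen for illustration. That bound follows from the classical Hedge analysis with learning rate η = ε ≤ 1, and the paper's own definition of the property may be stated differently.

```python
import math
import random


def hedge(losses, eta):
    """Run Hedge (multiplicative weights) on a T x d table of losses in [0, 1].

    Returns the learner's expected cumulative loss and the cumulative
    loss of the best fixed action in hindsight.
    """
    d = len(losses[0])
    log_w = [0.0] * d          # log-weights, for numerical stability
    totals = [0.0] * d         # cumulative loss of each action
    learner_loss = 0.0
    for row in losses:
        m = max(log_w)
        w = [math.exp(x - m) for x in log_w]
        z = sum(w)
        probs = [x / z for x in w]
        # Expected loss of sampling an action from the current distribution.
        learner_loss += sum(p * l for p, l in zip(probs, row))
        # Exponential weight update: w_i <- w_i * exp(-eta * loss_i).
        for i, l in enumerate(row):
            log_w[i] -= eta * l
            totals[i] += l
    return learner_loss, min(totals)


def approx_regret_bound(best_loss, d, eps):
    """Illustrative (assumed) bound: (1 + eps) * L* + 2 * ln(d) / eps, for eps <= 1."""
    return (1 + eps) * best_loss + 2 * math.log(d) / eps


# Hypothetical synthetic data: action 0 has smaller losses on average.
rng = random.Random(0)
T, d, eps = 2000, 10, 0.1
losses = [[rng.random() * (0.2 if i == 0 else 1.0) for i in range(d)]
          for _ in range(T)]

learner_loss, best_loss = hedge(losses, eta=eps)
print("Hedge expected loss:", learner_loss)
print("Best action loss:   ", best_loss)
print("Approximate bound:  ", approx_regret_bound(best_loss, d, eps))
```

The additive term in this sketch scales as ln(d)/ε and does not grow with the number of rounds T, which is the qualitative feature that a multiplicative (1 + ε) comparison buys over the usual √T regret guarantee.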

Similar resources

Fast Convergence in Population Games

A stochastic learning dynamic exhibits fast convergence in a population game if the expected waiting time until the process comes near a Nash equilibrium is bounded above for all sufficiently large populations. We propose a novel family of learning dynamics that exhibits fast convergence for a large class of population games that includes coordination games, potential games, and supermodular ga...

Two Novel Learning Algorithms for CMAC Neural Network Based on Changeable Learning Rate

The Cerebellar Model Articulation Controller (CMAC) neural network is a computational model of the cerebellum that acts as a lookup table. The advantages of CMAC are fast learning convergence and the capability of mapping nonlinear functions, due to its local generalization of weight updating, single structure and easy processing. In the training phase, the disadvantage of some CMAC models is unstable phenomenon...

An Improved Particle Swarm Optimizer Based on a Novel Class of Fast and Efficient Learning Factors Strategies

The particle swarm optimizer (PSO) is a population-based metaheuristic optimization method that can be applied to a wide range of problems, but it has drawbacks: it easily falls into local optima and suffers from slow convergence in the later stages. To solve these problems, improved PSO (IPSO) variants have been proposed. To bring about a balance between the exploration and ex...

The Impact of Playing Word Games on Young Iranian EFL Learners’ Vocabulary Learning and Retention

Acquiring adequate vocabulary in a foreign language is very important but often difficult. Considering the importance of learners’ vocabulary learning and retention, the present study aimed at examining the impact of playing word games on young Iranian EFL learners’ vocabulary learning and retention at Irandoostan Language Institute in Tabriz. To that end, 50 female learners at the age range of...

SIZE AND GEOMETRY OPTIMIZATION OF TRUSSES USING TEACHING-LEARNING-BASED OPTIMIZATION

This paper presents a novel optimization algorithm, teaching-learning-based optimization (TLBO), and its implementation procedure. TLBO is a meta-heuristic method that simulates the phenomenon in classes. TLBO has two phases: a teacher phase and a learner phase. Students learn from teachers in the teacher phase and obtain knowledge by mutual learning in the learner phase. The suit...

Publication date: 2016